CLAIMS

We claim:

1. A method for the automatic harvesting and qualification of 
dynamic database content comprising: 

obtaining an initial categorization structure for organizing a 
plurality of subject areas of information; 

obtaining a plurality of parametric information lists for 
optimizing operation to a user's requirements; 

acquiring a listing of a plurality of qualified databases from
said candidate database listing by matching each one of a plurality
of candidate databases to said plurality of subject areas;

obtaining a query from the user, said query being associated 
with a subject area; 

submitting said query to said plurality of qualified databases; 

acquiring a collection of responsive content from said 
plurality of qualified databases; 

indexing said responsive content to form an index for
facilitating searching of said collection of responsive content;

publishing a summary of said collection of responsive content 
for review by the user. 

2. The method of claim 1, wherein said step of obtaining a 
plurality of parametric information lists further comprises: 

obtaining a candidate database listing providing a plurality of 
databases to be considered for said step of acquiring a plurality of 
qualified databases; 

obtaining an exclusion list providing a plurality of terms and
sources to inhibit associations for said step of acquiring a
collection of responsive content;

obtaining an inclusion list providing a plurality of terms and 
sources restricting associations for said step of acquiring a 
collection of responsive content; 

obtaining a stop list providing a plurality of terms to be 
excluded for said step of indexing said responsive content. 

3. The method of claim 1, wherein said step of acquiring a 
plurality of qualified databases further comprises: 

capturing an initial page from each one of said plurality of 
candidate databases; 

evaluating said initial page for relevancy to each one of
said subject areas;

qualifying databases according to relevance to said subject
areas;

associating said qualified databases with said subject areas. 

4. The method of claim 3, further comprising: 

obtaining a database relevancy parameter for restricting the 
qualification of databases below a minimum threshold value; 

comparing the relevance of each initial page to said relevancy 
parameter; 

removing each candidate database with a relevancy below said 
minimum threshold value from qualification. 
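
By way of non-limiting illustration, the qualification steps of claims 3 and 4 might be sketched as follows. The relevance measure used here (the fraction of subject-area terms found on the initial page), the function names, and the sample pages are assumptions made for the example; the claims do not prescribe a particular measure.

```python
import re

def page_relevance(page_text, subject_terms):
    """Fraction of subject-area terms found on the page (an assumed measure)."""
    words = set(re.findall(r"[a-z0-9]+", page_text.lower()))
    terms = [t.lower() for t in subject_terms]
    if not terms:
        return 0.0
    return sum(1 for t in terms if t in words) / len(terms)

def qualify(pages, subject_terms, min_relevancy):
    """Return {database: relevance} for pages at or above the threshold."""
    scored = {db: page_relevance(text, subject_terms) for db, text in pages.items()}
    # Remove every candidate whose relevance falls below the minimum threshold.
    return {db: r for db, r in scored.items() if r >= min_relevancy}

# Hypothetical initial pages captured from two candidate databases.
pages = {
    "db_a": "Search our archive of patent and trademark filings",
    "db_b": "Weekly recipes and cooking tips",
}
qualified = qualify(pages, ["patent", "trademark", "filings", "claims"], 0.5)
```

In this sketch a candidate exactly at the threshold remains qualified, since claim 4 removes only candidates below the minimum value.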

5. The method of claim 1, wherein said step of acquiring a 
plurality of qualified databases further comprises: 

submitting a query to each one of said databases;

capturing a plurality of pieces of responsive content provided
by each one of said databases;

evaluating each one of said plurality of pieces of responsive
content for relevancy to said query;

assigning a numerical score to each one of said plurality of 
pieces of responsive content, said numerical score representing a 
degree of relevance to said query; 

developing an aggregate score for each one of said databases; 

selecting databases to be polled for content based upon said
aggregate score.
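
As an illustrative sketch only, the scoring and selection steps of claim 5 could be arranged as below. The use of an arithmetic mean as the aggregate score, the `min_aggregate` cutoff, and the sample scores are assumptions; the claim leaves the aggregation and the selection rule open.

```python
from statistics import mean

def aggregate_scores(piece_scores):
    """Average the per-piece relevance scores of each database (assumed aggregate)."""
    return {db: (mean(scores) if scores else 0.0) for db, scores in piece_scores.items()}

def select_databases(piece_scores, min_aggregate):
    """Return databases to poll, best aggregate first, dropping those below the cutoff."""
    agg = aggregate_scores(piece_scores)
    ranked = sorted(agg, key=agg.get, reverse=True)
    return [db for db in ranked if agg[db] >= min_aggregate]

# Hypothetical numerical scores assigned to the pieces returned by each database.
piece_scores = {"db_a": [0.9, 0.7, 0.8], "db_b": [0.2, 0.4], "db_c": []}
selected = select_databases(piece_scores, min_aggregate=0.5)
```

A database that returns no responsive content receives an aggregate of zero here, so it is never selected for polling under a positive cutoff.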

6. The method of claim 5, wherein said step of capturing a 
plurality of pieces of responsive content further comprises: 

obtaining a content parameter limiting the number of pieces 
of content to be captured from each one of said databases; 

obtaining an initial weighting of each one of said pieces of
responsive content from said database;

selecting a quantity of pieces of responsive content limited by 
said content parameter such that pieces of responsive content with a 
relatively greater initial weighting are selected before pieces of 
responsive content with a relatively lesser initial weighting. 
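
The selection of claim 6 amounts to taking the highest-weighted pieces up to the content parameter. A minimal sketch, assuming each piece is represented as a hypothetical `(initial_weighting, content_id)` pair:

```python
import heapq

def select_top_pieces(pieces, content_limit):
    """Return at most content_limit pieces, greatest initial weighting first."""
    return heapq.nlargest(content_limit, pieces, key=lambda p: p[0])

# Hypothetical weightings as reported by the source database.
pieces = [(0.4, "doc1"), (0.9, "doc2"), (0.7, "doc3"), (0.1, "doc4")]
top = select_top_pieces(pieces, content_limit=2)
```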

7. The method of claim 1, wherein said step of acquiring a 
plurality of qualified databases further comprises: 

capturing an initial page from each one of said plurality of 
candidate databases; 

evaluating said initial page for relevancy to each one of
said subject areas;

obtaining a database relevancy parameter for restricting the 
qualification of databases below a minimum threshold value; 

comparing the relevance of each initial page to said relevancy 
parameter; 

removing each candidate database with a relevancy below said
minimum threshold value from qualification;

qualifying databases according to relevance to said subject
areas;

submitting a query to each one of said databases; 

capturing a plurality of pieces of responsive content provided 
by each one of said databases; 

obtaining a content parameter limiting the number of pieces 
of content to be captured from each one of said databases; 

obtaining an initial weighting of each one of said pieces of 
responsive content from said database; 

selecting a quantity of pieces of responsive content limited by 
said content parameter such that pieces of responsive content with a 
relatively greater initial weighting are selected before pieces of 
responsive content with a relatively lesser initial weighting; 

evaluating each one of said plurality of pieces of responsive 
content for relevancy to said query; 

assigning a numerical score to each one of said plurality of 
pieces of responsive content, said numerical score representing a 
degree of relevance to said query; 

developing an aggregate score for each one of said databases; 

selecting databases to be polled for content based upon said 
aggregate score; 

associating said qualified databases with said subject areas. 

8. The method of claim 1, wherein said step of acquiring a 
plurality of qualified databases further comprises: 

analyzing an initial page from each one of said plurality of 
qualified databases for formatting; 

determining an input location for passing queries by said 
initial page to each one of said plurality of databases; 

determining results locations for capturing search results
returned from each one of said plurality of databases;

recording said input location and said results locations for
use in formatting queries for each one of said databases.
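
For illustration, one way the input location of claim 8 might be determined is by parsing the initial page for its search form. The sketch below assumes the initial page is HTML with a conventional `<form>` element; the class name `FormLocator` and the sample page are hypothetical, and determining results locations is not shown.

```python
from html.parser import HTMLParser

class FormLocator(HTMLParser):
    """Record the first form's action, method, and text-input field names."""

    def __init__(self):
        super().__init__()
        self.action = None
        self.method = "get"
        self.text_fields = []
        self._in_form = False

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if tag == "form" and self.action is None:
            self._in_form = True
            self.action = a.get("action") or ""
            self.method = (a.get("method") or "get").lower()
        elif tag == "input" and self._in_form:
            # Inputs without a type attribute default to text fields in HTML.
            kind = (a.get("type") or "text").lower()
            if kind in ("text", "search") and a.get("name"):
                self.text_fields.append(a["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._in_form = False

# Hypothetical initial page of a candidate database.
page = (
    '<html><body><form action="/search" method="GET">'
    '<input type="text" name="q"><input type="submit" value="Go">'
    "</form></body></html>"
)
locator = FormLocator()
locator.feed(page)
record = {"action": locator.action, "method": locator.method, "fields": locator.text_fields}
```

The recorded dictionary plays the role of the stored input location that later steps would consult when formatting queries for that database.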

9. The method of claim 1, wherein said step of acquiring a 
collection of responsive content further comprises: 

comparing each piece of responsive content to each one of 
said subject areas in said initial categorization structure; 

matching each piece of responsive content to subject areas 
based on relevance of the responsive content to the subject areas; 

filtering matches to optimize said categorization structure.

10. The method of claim 9, wherein said step of filtering 
matches further comprises: 

removing duplicate pieces of responsive content; 

obtaining a population parameter for limiting a number of 
pieces of responsive content which may be matched to any one 
subject area; 

obtaining an occurrence parameter for limiting a number of 
subject areas to which any one piece of responsive content may be 
matched; 

restricting matches for each one of said subject areas 
according to said occurrence parameter and said population 
parameter. 
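
The filtering of claim 10 can be pictured with the following sketch. It assumes that matches arrive as hypothetical `(content_id, subject, relevance)` tuples, that duplicates share a content identifier, and that stronger matches are admitted first; none of these details is fixed by the claim.

```python
def filter_matches(matches, population_limit, occurrence_limit):
    """Return {subject: [content_id, ...]} after de-duplication and limiting."""
    # Remove duplicates, keeping the highest relevance per (content, subject) pair.
    best = {}
    for content_id, subject, relevance in matches:
        key = (content_id, subject)
        if key not in best or relevance > best[key]:
            best[key] = relevance

    # Admit the strongest matches first (an assumed ordering).
    ranked = sorted(best.items(), key=lambda kv: kv[1], reverse=True)
    per_subject = {}   # subject -> matched content ids
    per_content = {}   # content id -> number of subjects matched
    for (content_id, subject), _ in ranked:
        if len(per_subject.get(subject, [])) >= population_limit:
            continue  # population parameter: subject area is full
        if per_content.get(content_id, 0) >= occurrence_limit:
            continue  # occurrence parameter: piece already matched enough subjects
        per_subject.setdefault(subject, []).append(content_id)
        per_content[content_id] = per_content.get(content_id, 0) + 1
    return per_subject

matches = [
    ("d1", "law", 0.9), ("d1", "law", 0.9), ("d2", "law", 0.8),
    ("d3", "law", 0.7), ("d1", "tech", 0.6), ("d1", "news", 0.5),
]
result = filter_matches(matches, population_limit=2, occurrence_limit=2)
```

With both limits set to 2, the third "law" match is dropped by the population parameter and the third subject for "d1" is dropped by the occurrence parameter.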

11. The method of claim 9, wherein said step of filtering
matches further comprises: 

obtaining an exclusion list to inhibit matches based on 
predetermined words and sources; 

obtaining an inclusion list to restrict matches based on 
predetermined words and sources; 

matching each piece of responsive content with subject areas
according to said exclusion list and said inclusion list.
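
A minimal sketch of the list-based filtering of claim 11 follows. It assumes that each list holds a set of words and a set of sources, that the exclusion list takes precedence, and that an empty inclusion list imposes no restriction; the record layout and the `.example` sources are hypothetical.

```python
def allowed(piece, exclusion, inclusion):
    """Decide whether a piece of content may be matched to subject areas."""
    words = set(piece["text"].lower().split())
    # The exclusion list inhibits a match on any listed word or source.
    if piece["source"] in exclusion["sources"] or words & exclusion["terms"]:
        return False
    # A non-empty inclusion list restricts matches to listed words or sources.
    if inclusion["terms"] or inclusion["sources"]:
        return piece["source"] in inclusion["sources"] or bool(words & inclusion["terms"])
    return True

exclusion = {"terms": {"advertisement"}, "sources": {"spam.example"}}
inclusion = {"terms": {"patent"}, "sources": set()}
pieces = [
    {"source": "news.example", "text": "New patent filing announced"},
    {"source": "spam.example", "text": "Patent news"},
    {"source": "news.example", "text": "Sports results"},
    {"source": "news.example", "text": "patent advertisement"},
]
kept = [p["text"] for p in pieces if allowed(p, exclusion, inclusion)]
```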

12. The method of claim 9, further comprising: 

creating a categorization file for recording matches between 
each piece of responsive content and each subject area; 

saving said categorization file to a storage medium for use in
searching said collection of responsive content.

13. The method of claim 1, wherein said step of indexing said 
responsive content further comprises: 

obtaining a stop list providing a list of words not to be 
indexed; 

parsing each piece of responsive content into constituent 
words; 

eliminating words of said responsive content occurring on
said stop list;

recording a location of every occurrence of constituent words
in said collection of responsive content.
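
The indexing steps of claim 13 correspond closely to building a positional inverted index. The sketch below is one possible arrangement, assuming words are lowercase alphanumeric runs and a location is a hypothetical `(document id, word position)` pair.

```python
import re
from collections import defaultdict

def build_index(documents, stop_list):
    """Map each indexed word to every (doc_id, position) at which it occurs."""
    index = defaultdict(list)
    for doc_id, text in documents.items():
        # Parse the piece of content into its constituent words.
        words = re.findall(r"[a-z0-9]+", text.lower())
        for position, word in enumerate(words):
            # Positions count all words, so stop words still occupy a slot.
            if word not in stop_list:
                index[word].append((doc_id, position))
    return dict(index)

docs = {"d1": "The database of the patents", "d2": "A patents database"}
index = build_index(docs, stop_list={"the", "of", "a"})
```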

14. The method of claim 1, wherein said step of publishing a 
summary further comprises: 

determining if a summary is provided for each piece of said 
responsive content; 

examining each piece of said responsive content for keywords 
associated with each subject area; 

developing a keyword summary score for each piece of 
responsive content; 

examining each piece of said responsive content for relevant 
extracts forming an extract summary; 

developing an extract score for each piece of responsive
content;

comparing said keyword summary score to said extract score
for a summary composite score;

selecting said keyword summary if a predetermined summary 
value is exceeded by said summary composite score; 

selecting said extract summary if a predetermined summary
value is not exceeded by said summary composite score.
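
The selection logic of claim 14 might be sketched as below. Two points are assumptions rather than claim language: the composite score is taken as the keyword summary score minus the extract score, and a summary already provided with the content is used as-is.

```python
def choose_summary(provided, keyword_summary, keyword_score,
                   extract_summary, extract_score, summary_value):
    """Pick the summary to publish for one piece of responsive content."""
    if provided:
        # Assumption: a summary supplied with the content is published unchanged.
        return provided
    # Assumed comparison: how far the keyword summary outscores the extract.
    composite = keyword_score - extract_score
    if composite > summary_value:
        return keyword_summary
    return extract_summary

picked_keyword = choose_summary(None, "kw summary", 0.8, "extract", 0.5, 0.2)
picked_extract = choose_summary(None, "kw summary", 0.6, "extract", 0.5, 0.2)
picked_provided = choose_summary("author abstract", "kw summary", 0.8, "extract", 0.5, 0.2)
```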

15. A system for the automatic harvesting and qualification of 
dynamic database content comprising: 

a computer system having a communication means for 
communicating with at least one other computer including a 
database to facilitate the two-way flow of information between said 
computer system and the at least one other computer; 

said computer system having a storage means for retention 
and recall of data communicated by or to the at least one other 
computer; 

said computer system having a processing means for executing 
multiple software modules and performing comparisons between a 
user supplied query and a plurality of documents found in at least 
one other computer; 

an index for storing a plurality of pre-approved internet sites 
to be included in a series of queries; 

a configuration module adapted for translating a generic query
into site-specific dialects such that a single user defined query may
be directed to multiple sites automatically;

a selection module adapted for characterizing said plurality of 
documents returned by the database of the at least one other 
computer and associated with said user defined query; 

a results index to allow for rapid recovery of specific portions 
of any one of said plurality of documents characterized by said 
selection module; and

a generator module for automatically generating at least one
results page for the user conveying information associated with any 
one of said plurality of documents associated with said query. 
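
To illustrate the configuration module of claim 15, the sketch below translates one generic query into per-site request URLs. The site table, its field names (`url`, `param`, `joiner`), and the `.example` hosts are hypothetical; an actual system would presumably derive such settings from the recorded input locations of each site.

```python
from urllib.parse import urlencode

# Hypothetical per-site dialects: endpoint, query parameter name, term joiner.
SITE_DIALECTS = {
    "site_a": {"url": "https://site-a.example/search", "param": "q", "joiner": " "},
    "site_b": {"url": "https://site-b.example/find", "param": "terms", "joiner": " AND "},
}

def translate_query(terms, dialects):
    """Return {site: request URL} for a single generic list of query terms."""
    requests = {}
    for site, dialect in dialects.items():
        site_query = dialect["joiner"].join(terms)
        query_string = urlencode({dialect["param"]: site_query})
        requests[site] = f'{dialect["url"]}?{query_string}'
    return requests

urls = translate_query(["dynamic", "database"], SITE_DIALECTS)
```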